* NOTICES *

Japan Patent Office is not responsible for any
damages caused by the use of this translation.

1. This document has been translated by computer, so the translation may not reflect the original precisely.
2. **** shows a word which cannot be translated.
3. In the drawings, any words are not translated.



CLAIMS 
[Claim(s)]



[Claim 1] An information-analysis method using the correlation rule discovery technique, characterized by comprising: a step of, when additional information is inputted, verifying the existing analysis result information against the additional information to acquire 1st analysis result information, and analyzing the additional information to acquire 2nd analysis result information; and a step of combining the 1st analysis result information and the 2nd analysis result information to generate 3rd analysis result information.

[Claim 2] The information-analysis method according to claim 1, characterized by including a step of saving, together with the 2nd analysis result information, information specifying the time of the analysis and information specifying the accumulated frequency, as analysis result information to be used at the time of the next addition of information.
[Claim 3] An information-analysis method using the correlation rule discovery technique, characterized by comprising: a step of, when information is added and deleted, verifying the existing analysis result information against the additional information to obtain 1st analysis result information, and analyzing the additional information to obtain 2nd analysis result information; and a step of combining the 2nd analysis result information with the analysis result information obtained by subtracting, from the 1st analysis result information, the analysis result information to be deleted, to generate 3rd analysis result information.

[Claim 4] Information-analysis equipment using the correlation rule discovery technique, characterized by comprising: a means to input additional information; a means to verify the existing analysis result information against the additional information and to generate 1st analysis result information when the additional information is inputted; a means to analyze the additional information and to generate 2nd analysis result information; and a means to combine the 1st analysis result information and the 2nd analysis result information and to generate 3rd analysis result information.
[Claim 5] The information-analysis equipment according to claim 4, characterized by including a means to save, together with the 2nd analysis result information, information specifying the time of the analysis and information specifying the accumulated frequency, as analysis result information to be used at the time of the next addition of information.
[Claim 6] Information-analysis equipment using the correlation rule discovery technique, characterized by comprising: a means to verify the existing analysis result information against additional information and to acquire 1st analysis result information when information is added and deleted; a means to analyze the additional information and to acquire 2nd analysis result information; and a means to combine the 2nd analysis result information with the analysis result information obtained by subtracting, from the 1st analysis result information, the analysis result information to be deleted, and to generate 3rd analysis result information.
[Detailed Description of the Invention]
[0001]

[The technical field to which the invention belongs] This invention relates to an information-analysis method and information-analysis equipment that use the correlation rule discovery technique.
[0002] 

[Description of the Prior Art] Data mining is attracting attention as a technology for extracting knowledge from large-scale databases. Various data-mining techniques have been proposed, such as decision trees, neural networks, correlation rule discovery, and clustering. These techniques extract features hidden in a database, and application to various fields such as marketing is expected.

[0003] The database that is the object of mining is generally not the one in operational use by the core business system; instead, snapshots are taken periodically and a separate database (data warehouse) is built from them. The database is therefore usually not updated in real time; the data added after a fixed period are appended in a batch. For this reason, to grasp trends over the whole database, mining must be performed on the whole database every time the periodic data are added. The database subject to mining is in many cases huge, and mining the whole database at every addition of data has taken a great deal of execution time.
[0004] Correlation rule discovery is one of the typical mining techniques and is used for basket analysis in the retail trade. Basket analysis analyzes the groups of items that a customer purchases together in one transaction; for example, it can discover a correlation rule such as "a customer who buys beer also buys disposable diapers at the same time". This processing is performed by the following procedure.
[0005] 1: Obtain the frequency of occurrence of each item over all transactions.
2: Remove the items whose frequency of occurrence is below the minimum support value.
3: Perform a self join (SELF JOIN) of this table and obtain the frequency with which two items occur together.
4: Remove the item pairs whose frequency of occurrence is below the minimum support value.
5: For the extracted item pairs, generate the correlation rules whose confidence value is at or above the minimum confidence value.

[0006] This is then repeated, and correlation rules are generated in the same way for groups of three or more items. The user sets the minimum support value and the minimum confidence value initially. For a correlation rule of the form {A1, A2, ..., An} -> B, the support value and the confidence value are defined as follows.

[0007] Support value = (number of occurrences of A1, A2, ..., An and B together) / (total number of transactions)
Confidence value = (number of occurrences of A1, A2, ..., An and B together) / (number of occurrences of A1, A2, ..., An)
Using these two values, correlation rules between items with a high frequency of occurrence are extracted.
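
As a minimal sketch of the two definitions in [0007], the following Python function computes the support value and the confidence value of a rule of the form {A1, ..., An} -> B. The function name `support_and_confidence` and the sample baskets are hypothetical and invented for illustration only; they do not come from the original document.

```python
def support_and_confidence(transactions, left, right):
    """Return (support, confidence) for the rule {left items} -> right."""
    left = set(left)
    both = left | {right}
    # Transactions containing the left-hand side and the right-hand side together.
    n_both = sum(1 for t in transactions if both <= t)
    # Transactions containing the left-hand side (with or without the right-hand side).
    n_left = sum(1 for t in transactions if left <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_left if n_left else 0.0
    return support, confidence


# Hypothetical baskets, invented for illustration only.
baskets = [
    {"beer", "diaper"},
    {"beer", "diaper", "milk"},
    {"beer"},
    {"milk"},
]
support, confidence = support_and_confidence(baskets, {"beer"}, "diaper")
print(support, confidence)
```

In this toy data the rule "beer -> diaper" occurs in two of four baskets, and beer occurs in three, so the support is 2/4 and the confidence is 2/3.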
[0008] 

[Problem(s) to be Solved by the Invention] Conventionally, the whole database must be searched to obtain the frequency of occurrence of each item and of each item group. Alternatively, when an index has been created for each item, the whole index must be searched. Moreover, when many items are at or above the minimum support value, the processing required for the self join becomes huge. Thus correlation rule discovery takes a great deal of processing time to analyze a whole large-scale database.

[0009] That is, whenever content was added to the database, mining had to be performed again over the whole database, and the conventional method took a great deal of processing time each time.

[0010] Therefore, the purpose of this invention is to offer an information-analysis method and equipment that efficiently extract the features contained in the latest content of the database, by performing information analysis (mining) only on the portion added to the database and by using the result of the information analysis (mining) performed before the addition of information.
[0011] 

[Means for Solving the Problem] This invention offers an information-analysis method using the correlation rule discovery technique, characterized by having: a step of, when additional information is inputted, verifying the existing analysis result information against the additional information to acquire 1st analysis result information, and analyzing the additional information to acquire 2nd analysis result information; and a step of combining the 1st analysis result information and the 2nd analysis result information to generate 3rd analysis result information.

[0012] This invention also offers an information-analysis method using the correlation rule discovery technique, characterized by having: a step of, when information is added and deleted, verifying the existing analysis result information against the additional information to obtain 1st analysis result information, and analyzing the additional information to obtain 2nd analysis result information; and a step of combining the 2nd analysis result information with the analysis result information obtained by subtracting, from the 1st analysis result information, the analysis result information to be deleted, to generate 3rd analysis result information.

[0013] In particular, this invention offers an incremental information mining method characterized in that, when information is added in correlation rule discovery, only the additional information is mined to generate mining information for the additional information; the correlation rules in the past mining information, obtained by mining the information before the addition, are verified using the additional information; and, according to this verification result, the mining information for the additional information is combined with the past mining information to generate the mining result for the whole database including the additional information.
[0014] This invention offers information-analysis equipment using the correlation rule discovery technique, characterized by providing: a means to input additional information; a means to verify the existing analysis result information against the additional information and to generate 1st analysis result information when the additional information is inputted; a means to analyze the additional information and to generate 2nd analysis result information; and a means to combine the 1st analysis result information and the 2nd analysis result information and to generate 3rd analysis result information.

[0015] This invention also offers information-analysis equipment using the correlation rule discovery technique, characterized by providing: a means to verify the existing analysis result information against additional information and to acquire 1st analysis result information when information is added and deleted; a means to analyze the additional information and to acquire 2nd analysis result information; and a means to combine the 2nd analysis result information with the analysis result information obtained by subtracting, from the 1st analysis result information, the analysis result information to be deleted, and to generate 3rd analysis result information.

[0016] This invention further offers incremental information mining equipment characterized by consisting of: a means to add information in correlation rule discovery; a new mining means to mine only the additional information and to generate 1st mining result information; a verification means to verify, using the additional information, the past mining result information obtained by mining the information before the addition, and to generate 2nd mining result information; and a synthesis means to combine the 2nd mining result information obtained by this verification means with the 1st mining result information, and to generate the mining result for the whole database including the additional information.

[0017] According to this invention, the features contained in the latest content of the database are extracted efficiently by mining only the additional information and using the mining result obtained before the addition of information. Therefore, when information is added, the whole large-scale database need not be processed, and the information mining operation performed routinely can be accelerated greatly.
[0018] 

[Embodiments of the Invention] Drawing 1 shows the structure of a system that realizes the incremental data-mining method of this invention. It shows a past mining system and a new mining system. The past mining system contains the original database 11 and the past mining section 12. The original database 11 stores a large number of item data collected in the past; the past mining section 12 mines the past data and generates the past mining result 13.

[0019] The new mining system consists of the additional data generating section 21, the new mining section 22, the verification section 23, and the synthesis section 24. The output of the additional data generating section 21 is connected to the new mining section 22 and the verification section 23, and the outputs of the new mining section 22 and the verification section 23 are connected to the synthesis section 24.

[0020] The new mining section 22 performs the same processing as conventional mining, but only on the additional data instead of the whole database. Mining can therefore be accelerated greatly compared with the conventional method. The verification section 23 verifies whether the past mining result continues to hold for the present database. Specifically, the verification section 23 verifies whether the past mining result (i.e., the past correlation rules) holds for the additional data. The synthesis section 24 combines and outputs the results of the new mining section 22 and the verification section 23, and also generates the information that the verification section will need for its judgment in the next mining.
[0021] In general, verifying whether knowledge extracted in the past applies to the present is easier than mining unknown data to extract knowledge. For example, in correlation rule discovery, if the item groups extracted in the past are taken as the knowledge and the frequency with which they occur in the additional data is counted, it is easy to verify whether the past mining result applies to the additional data. For this reason, mining of the whole database containing the added data can be accelerated.

[0022] (1st embodiment) The incremental data-mining method of the 1st embodiment of this invention is explained. First, the past mining system, which performs data mining on four transactions, is explained with reference to the flow chart of Drawing 2. In this example each transaction corresponds to one purchase by a consumer and is given a unique identification number (TID). Here there are four transactions: 100, 200, 300, and 400. A, B, C, D, and E denote items. The list of items purchased in each transaction is assumed to be as shown in Table 1.
[0023] Table 1
TID   Item list
100   (A, C, D)
200   (B, C, E)
300   (A, B, C, E)
400   (B, E)

When the above item list is read from the original database 11 (S11) and sent to the past mining section 12, the frequency of occurrence of each item is obtained (S12). The frequencies of occurrence obtained at this time are shown in Table 2.

[0024] Table 2
Item   Frequency of occurrence
A      2
B      3
C      3
D      1
E      3

Here the minimum support value is set to 0.3, and the items with low frequency are removed (S13). That is, since the number of transactions is 4, the items whose frequency of occurrence is less than 1.2 are removed. Here item D is removed. A self join is performed on the four remaining items (S14) to generate item groups. The frequency of occurrence of each item group is then obtained from the original transaction data (S15); the result is shown in Table 3.

[0025] Table 3
Item group   Frequency of occurrence
(A, B)       1
(A, C)       2
(A, E)       1
(B, C)       2
(B, E)       3
(C, E)       2

Of these, (A, B) and (A, E) are removed because their frequency of occurrence is below the minimum support value (1.2) (S16). Since two or more item groups remain after the removal, processing continues (S17). That is, processing returns to step S14 and the self join of the pairs is taken (S14). This generates groups of three items. When their frequency of occurrence is obtained from the transaction data, the item group (B, C, E) is found to have a frequency of 2, and there are no other solutions. The loop ends here (S17).
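
The walk through steps S12 to S17 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name `frequent_item_groups` is hypothetical, and the candidate generation keeps a larger group only when all of its sub-groups survived the previous round, which is a common simplification standing in for the self join described in the text. The data are the four transactions of Table 1.

```python
from itertools import combinations

# Transactions of Table 1 (TID -> purchased items).
TABLE_1 = {
    100: {"A", "C", "D"},
    200: {"B", "C", "E"},
    300: {"A", "B", "C", "E"},
    400: {"B", "E"},
}


def frequent_item_groups(transactions, min_support):
    """Return {item group: frequency} for groups of two or more items."""
    threshold = min_support * len(transactions)

    # S12-S13: count single items and drop those below the threshold.
    counts = {}
    for basket in transactions.values():
        for item in basket:
            counts[(item,)] = counts.get((item,), 0) + 1
    current = {group for group, n in counts.items() if n >= threshold}

    result = {}
    size = 2
    while len(current) >= 2:  # S17: continue while two or more groups remain
        # S14: join the surviving groups into candidates one item larger.
        pool = sorted({item for group in current for item in group})
        candidates = [
            cand for cand in combinations(pool, size)
            if all(sub in current for sub in combinations(cand, size - 1))
        ]
        # S15-S16: count each candidate and keep those meeting the threshold.
        level = {}
        for cand in candidates:
            n = sum(1 for basket in transactions.values() if set(cand) <= basket)
            if n >= threshold:
                level[cand] = n
        result.update(level)
        current = set(level)
        size += 1
    return result


table_4 = frequent_item_groups(TABLE_1, 0.3)
for group, n in sorted(table_4.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(group, n)
```

With a minimum support value of 0.3, the sketch keeps the pairs (A, C), (B, C), (B, E), (C, E) and the triple (B, C, E), in agreement with the counts derived in the text.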

[0026] To generate correlation rules from the item groups detected by the processing so far, it suffices to split the elements of each item group into the left-hand side and the right-hand side of a rule and compute the confidence value.
[0027] Since the confidence value is defined as
Confidence value = (number of occurrences of the left-hand side and the right-hand side together) / (number of occurrences of the left-hand side),
for (A, B), for example, the confidence value of A->B is 1/2 and the confidence value of B->A is 1/3. Of these, the ones at or above the minimum confidence value become the generated correlation rules. That is, those at or above the minimum confidence value are output as the mining result (S18). The processing bottleneck of this algorithm is the part that finds the item groups at or above the minimum support value, so the discussion here covers the processing up to the output of those item groups as the mining result. Therefore, as shown in Table 4, the mining result for this example is taken to be the item groups and their frequencies of occurrence.
[0028] Table 4
Item group   Frequency of occurrence
(A, C)       2
(B, C)       2
(B, E)       3
(C, E)       2
(B, C, E)    2

Next, the operation of the new mining section when there is additional data is explained with reference to the flow chart of Drawing 3. The additional data for the above database are assumed to be as shown in Table 5.

[0029] Table 5
TID   Item list
500   (A, B, C)
600   (A, C, E)
700   (B, E, F)
800   (A, B, F)

When this additional data is input (S21), the frequency of occurrence of each item in the additional data is obtained (S22). The frequencies of occurrence obtained at this time are shown in Table 6.

[0030] Table 6
Item   Frequency of occurrence
A      3
B      3
C      2
E      2
F      2

Here the minimum support value is set to 0.3, and the items with low frequency are removed (S23). That is, since the number of transactions is 4, the items whose frequency of occurrence is less than 1.2 are removed. Here there is no item to remove, so a self join is performed on the five items (S24) to generate item groups. The frequency of occurrence of each item group is then obtained from the transaction data (S25); the result is shown in Table 7.

[0031] Table 7
Item group   Frequency of occurrence
(A, B)       2
(A, C)       2
(B, F)       2
(E, F)       1

Of these, (E, F) is removed because its frequency of occurrence is below the minimum support value (S26). Three item groups are thereby generated. When the frequency of occurrence is obtained from the transaction data, the frequency of each of these item groups is found to be 2, and there are no other solutions. The loop ends here (S27). The item groups at or above the minimum support value are then chosen (S28). The item groups shown in Table 8 and their frequencies of occurrence are thereby obtained. This is the result for the additional data alone.

[0032] Table 8
Item group   Frequency of occurrence
(A, B)       2
(A, C)       2
(B, F)       2

Next, mining of the whole database with the additional data added is explained. First, it is shown that the correct mining result is not obtained by simply totaling the mining result before the addition and the mining result for the additional data.

[0033] If the mining result before the addition shown in Table 4 and the mining result for the additional data shown in Table 8 are totaled, the number of transactions becomes 8, so with a minimum support value of 0.3 the two item groups shown in Table 9 are obtained as the item groups with a frequency of 2.4 or more.

[0034] Table 9
Item group   Frequency of occurrence
(A, C)       4
(B, E)       3

On the other hand, if the additional data are first added to the original database and mining is then performed on the whole, the result shown in Table 10 is obtained as the item groups with a frequency of 2.4 or more.

[0035] Table 10
Item group   Frequency of occurrence
(A, B)       3
(A, C)       4
(B, C)       3
(B, E)       4
(C, E)       3

As a comparison of Table 9 with Table 10 shows, mining the whole yields five results, whereas simply totaling the separately mined results from before the addition and for the additional data yields only two; three pieces of information are lost.
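
The loss described here can be checked with a few lines of Python. The sketch below simply totals the counts of Table 4 and Table 8 and applies the threshold of 2.4; the variable names are illustrative, and the contents of Table 10 are copied from the text rather than recomputed.

```python
from collections import Counter

# Mining result before the addition (Table 4) and for the additional data (Table 8).
table_4 = Counter({("A", "C"): 2, ("B", "C"): 2, ("B", "E"): 3,
                   ("C", "E"): 2, ("B", "C", "E"): 2})
table_8 = Counter({("A", "B"): 2, ("A", "C"): 2, ("B", "F"): 2})

# Result of mining the whole database of eight transactions (Table 10).
table_10 = {("A", "B"): 3, ("A", "C"): 4, ("B", "C"): 3,
            ("B", "E"): 4, ("C", "E"): 3}

threshold = 0.3 * 8  # minimum support value x number of transactions = 2.4

# Naive approach: add the two result tables and filter by the threshold.
naive_total = table_4 + table_8
naive = {group: n for group, n in naive_total.items() if n >= threshold}

# Item groups found by whole-database mining but missed by the naive total.
lost = set(table_10) - set(naive)

print(naive)
print(sorted(lost))
```

The naive total misses every item group that was pruned on one side before totaling, and it also undercounts (B, E) as 3 instead of 4, because the occurrence of (B, E) in the additional data never reached the result of Table 8.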

[0036] The method of this invention verifies the mining result from before the addition against the additional data and combines the mining result for the additional data with it. This technique is explained below with reference to the flow charts of Drawing 4 and Drawing 5.
[0037] The mining result for the data before the addition (TID = 100-400), i.e., the past mining result, is obtained (S31). This mining result is the same as Table 4. These item groups are verified against the additional data (TID = 500-800). That is, the frequency of occurrence of each item group in the additional data is computed (S32) and added to its frequency in the past mining result (S33). The mining result with the verification result added is shown in Table 11.

[0038] Table 11
Item group   Frequency of occurrence
(A, C)       2+2=4
(B, C)       2+1=3
(B, E)       3+1=4
(C, E)       2+1=3
(B, C, E)    2+0=2

When compared with the minimum support value (S34), (A, C), (B, C), (B, E), and (C, E) are at or above it, so these item groups are passed to the synthesis section 24 (S35).

[0039] The mining result for the additional data alone is as shown in Table 8; as shown in Table 12 below, three item groups are obtained. This is passed to the synthesis section 24.

[0040] Table 12
Item group   Frequency of occurrence
(A, B)       2
(A, C)       2
(B, F)       2

In the synthesis section 24, as shown in the flow chart of Drawing 5, the result of the new mining section 22 (S41) and the data of the verification section 23 (S42) are combined to generate the mining result after the addition. In this synthesis, it is judged whether each generated rule exists in both the continuation from the past mining result and the new mining result (S43). If this judgment is NO, it is judged whether the rule exists only in the output of the new mining section (S44). A rule that exists in both is output as a continuation (S45). A rule that exists only in the output of the new mining section is output as new (S46). The continuation/new distinction is recorded with each rule at this time. The result of the synthesis is shown in Table 13.
[0041] Table 13
Item group   Frequency of occurrence   Distinction
(A, C)       4                         Continuation
(B, C)       3                         Continuation
(B, E)       4                         Continuation
(C, E)       3                         Continuation
(A, B)       2                         New
(B, F)       2                         New

Comparing this mining result after the addition with the result of mining the whole with the additional data added (Table 10) shows that it contains all five rules found by mining the whole, and that the technique of this invention additionally extracts (B, F). This shows that the technique of this invention extracts continuing features as well as mining the whole database does, and that it can in addition extract a feature, (B, F), that is contained only in the new data.
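
The verification and synthesis steps can be sketched as follows, using the past mining result of Table 4, the additional transactions of Table 5, and the new mining result of Table 8. This is a sketch under assumptions rather than the procedure of the drawings: the function names are hypothetical, every past item group that passes verification is labeled a continuation (as Table 13 does), and an item group is labeled new only if it was absent from the past result, a case the text does not spell out further.

```python
# Past mining result (Table 4), additional transactions (Table 5),
# and the mining result of the additional data alone (Table 8).
PAST_RESULT = {("A", "C"): 2, ("B", "C"): 2, ("B", "E"): 3,
               ("C", "E"): 2, ("B", "C", "E"): 2}
ADDED = [{"A", "B", "C"}, {"A", "C", "E"}, {"B", "E", "F"}, {"A", "B", "F"}]
NEW_RESULT = {("A", "B"): 2, ("A", "C"): 2, ("B", "F"): 2}


def count(group, transactions):
    """Number of transactions that contain every item of the group."""
    return sum(1 for basket in transactions if set(group) <= basket)


def incremental_merge(past_result, new_result, added, total_transactions, min_support):
    """Return {item group: (frequency, 'continuation' or 'new')}."""
    threshold = min_support * total_transactions
    merged = {}
    # Verification (S32-S35): re-count past item groups in the added data only.
    for group, past_count in past_result.items():
        total = past_count + count(group, added)
        if total >= threshold:
            merged[group] = (total, "continuation")
    # Synthesis (S43-S46): item groups seen only in the new mining result.
    for group, n in new_result.items():
        if group not in past_result:  # assumption: "new" means absent from the past result
            merged[group] = (n, "new")
    return merged


table_13 = incremental_merge(PAST_RESULT, NEW_RESULT, ADDED,
                             total_transactions=8, min_support=0.3)
for group, (n, label) in table_13.items():
    print(group, n, label)
```

Only the four added transactions are scanned here; the four original transactions are never read again, which is the source of the speed-up the text claims.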

[0042] The case where data are added only once was explained above; next, the case where data are added continually and mining is performed each time is explained. The structure of the system in this case is shown in Drawing 6. It shows an initial mining system and a new mining system. The initial mining system contains the initial database 31 and the initial mining section 32. The initial database 31 stores a large number of item data collected initially; the initial mining section 32 mines the initial data and generates the initial mining result 33.
[0043] The new mining system consists, as in Drawing 1, of the additional data generating section 21, the new mining section 22, the verification section 23, and the synthesis section 24. In this system, the output of the synthesis section 24 is used as the mining result the next time.

[0044] For example, when data are added once a month and mining is performed on each month's additional data, considerable variation is expected among the monthly mining results. On the other hand, if mining is performed on the whole database after adding the data, only the rules whose frequency is high over the whole are extracted.

[0045] Conventionally, to extract both kinds of rules, two mining operations had to be performed: mining of the additional data and mining of the whole. With the technique of this invention, the rules whose frequency is high over the whole can be obtained efficiently on the basis of mining the additional data, without mining the whole.

[0046] An example in which data are added continually is explained below. The time of the first mining is taken as 0, and data are assumed to be added at times 1, 2, 3, and 4. The number of data records at time 0 and the number of records added at each time are each assumed to be 1000. The minimum support value is 0.1; that is, rules with a frequency of 100 or more in the data added at each time are extracted.

[0047] As a result of mining the additional data at times 0-4, assume that the frequencies shown in Table 14 were obtained for six rules within the data added at each time.

[0048]

Table 14
Time       0    1    2    3    4
Rule 1   200  160  180  150  140
Rule 2   150   40   30   10   10
Rule 3   120  120   80   90  120
Rule 4   100   60  110   70  100
Rule 5    80  130  120  140  150
Rule 6    40   50  150  120   90

That is, if mining is performed on the data added at each time, the rules with a frequency of 100 or more are obtained as the result. In other words, the underlined portions of Table 14 (the entries of 100 or more) are output as the mining result.

[0049] Next, the case where mining is performed on the whole after data are added at each time is explained. The frequency of each rule is the accumulated value of its frequencies up to that time, as shown in Table 15.

[0050]

Table 15
Time       0    1    2    3    4
Rule 1   200  360  540  690  830
Rule 2   150  190  220  230  240
Rule 3   120  240  320  410  530
Rule 4   100  160  270  340  440
Rule 5    80  210  330  470  620
Rule 6    40   90  240  360  450

In this case, rules with a frequency of 100 or more at time 0, 200 or more at time 1, 300 or more at time 2, 400 or more at time 3, and 500 or more at time 4 are output as the mining result. That is, the underlined portions of Table 15 are output as the result.

[0051] In the technique of this invention, as shown in Drawing 7, the synthesis section generates three pieces of information as the mining result at each time by the following procedure: the rule, its start time, and its accumulated frequency. These are saved and reused.

[0052] First, it is judged whether the rule is included in the accumulated mining result 33 (S51). If this judgment is YES (i.e., if the rule is included in the past mining result), the frequency in the additional data of the present time is added to the accumulated frequency of the past mining result, the rule is output (S54), and the start time is left as it is (S55).

[0053] If the judgment at step S51 is NO, that is, if the rule is not included in the past mining result, and its frequency in the additional data of the present time is at or above the minimum support value, the rule is output with its accumulated frequency set to the frequency in the additional data of the present time (S52), and its start time is set to the present time (S53).
[0054] If this technique is applied to the above example, the output of mining at each time is as shown in Table 16 below.
[0055]

Table 16
          Rule     Start time   Accumulated frequency
Time 0    Rule 1   0            200
          Rule 2   0            150
          Rule 3   0            120
          Rule 4   0            100
Time 1    Rule 1   0            200+160=360
          Rule 2   0            150+40=190
          Rule 3   0            120+120=240
          Rule 4   0            100+60=160
          Rule 5   1            130
Time 2    Rule 1   0            360+180=540
          Rule 2   0            190+30=220
          Rule 3   0            240+80=320
          Rule 4   0            160+110=270
          Rule 5   1            130+120=250
          Rule 6   2            150
Time 3    Rule 1   0            540+150=690
          Rule 2   0            220+10=230
          Rule 3   0            320+90=410
          Rule 4   0            270+70=340
          Rule 5   1            250+140=390
          Rule 6   2            150+120=270
Time 4    Rule 1   0            690+140=830
          Rule 2   0            230+10=240
          Rule 3   0            410+120=530
          Rule 4   0            340+100=440
          Rule 5   1            390+150=540
          Rule 6   2            270+90=360

In this way, a rule whose frequency is at or above the minimum support value even once in the data added at some time is output as a mining result at all times after that. That is, all the results obtained by mining the whole database at an arbitrary time are included in this list.
[0056] With this technique, the mining result grows each time data are added, so the execution time of mining may increase. As an improvement, an output rule may be removed when the ratio of its accumulated frequency falls to or below a fixed value. For example, if a rule is removed from the result when the ratio of its accumulated frequency becomes 0.05 or less, rule 2 is removed at time 4. This judgment can be computed easily if the start time and the number of transactions added at each time are held.
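
The bookkeeping of steps S51 to S55 and the pruning idea of [0056] can be sketched as follows, using the per-time frequencies of Table 14. The function names `update` and `prune` are hypothetical. One point is an assumption of this sketch: the "ratio of accumulated frequency" is taken to be the accumulated frequency divided by the number of transactions added since the rule's start time, which is what the last sentence of [0056] suggests but does not state outright.

```python
# Frequency of each rule within the data added at times 0..4 (Table 14).
TABLE_14 = {
    "Rule 1": [200, 160, 180, 150, 140],
    "Rule 2": [150, 40, 30, 10, 10],
    "Rule 3": [120, 120, 80, 90, 120],
    "Rule 4": [100, 60, 110, 70, 100],
    "Rule 5": [80, 130, 120, 140, 150],
    "Rule 6": [40, 50, 150, 120, 90],
}
ADDED_PER_TIME = 1000  # records added at each time
MIN_COUNT = 100        # minimum support value 0.1 x 1000 records


def update(accumulated, new_counts, now, min_count):
    """accumulated maps rule -> (start time, accumulated frequency)."""
    updated = dict(accumulated)
    for rule, n in new_counts.items():
        if rule in accumulated:                  # S51: YES
            start, total = accumulated[rule]
            updated[rule] = (start, total + n)   # S54, S55: start time unchanged
        elif n >= min_count:                     # S51: NO, but frequent now
            updated[rule] = (now, n)             # S52, S53: start time = now
    return updated


def prune(accumulated, now, added_per_time, min_ratio):
    """Keep rules whose accumulated frequency ratio is above min_ratio."""
    kept = {}
    for rule, (start, total) in accumulated.items():
        seen = (now - start + 1) * added_per_time  # transactions since start time
        if total / seen > min_ratio:
            kept[rule] = (start, total)
    return kept


state = {}
for now in range(5):
    new_counts = {rule: counts[now] for rule, counts in TABLE_14.items()}
    state = update(state, new_counts, now, MIN_COUNT)

removed = set(state) - set(prune(state, 4, ADDED_PER_TIME, 0.05))
print(state)
print(removed)
```

Under this reading, rule 2 has an accumulated frequency of 240 over 5000 transactions at time 4, a ratio of 0.048, so it is the only rule removed, which agrees with the example in the text.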

[0057] (2nd operation gestalt) Although the 1st operation gestalt describes the case where data **-SU is added, usage which 
sets constant the period of the data stored in data **-SU may be carried out like the past one year. In this case, it is necessary to 
remove the data which separated from the period whenever it added new data, and to take removal into consideration also 
about maintenance of a mining result 

[0058] Below, the periodic incremental mining system according to the 2nd embodiment of this invention is explained with reference to drawing 8.

[0059] In the composition of drawing 8, the mining result 41 classified by time is added to the system of drawing 6. This system is explained using the same data as the example used in the 1st embodiment. That is, the same values as in Table 14 are used for the frequency of occurrence of rules 1-6 at times 0-5.

[0060] Here, the period is set to 3, i.e., the data of the past 3 times are held. The mining result of the whole database when the period is set to 3 is shown in Table 17.

[0061] 

Table 17

Time      0     1     2     3     4
Rule 1    200   360   540   490   470
Rule 2    150   190   220   80    50
Rule 3    120   240   320   290   290
Rule 4    100   160   270   240   280
Rule 5    80    210   330   390   410
Rule 6    40    90    240   320   360

In this case, rules with a frequency of 100 or more at time 0, 200 or more at time 1, and 300 or more from time 3 onward are outputted as a mining result. That is, the underlined portion of the above-mentioned Table 17 is outputted as a result.

[0062] Below, the technique of obtaining the whole mining result for a period of 3 from the mining result of the additional portion and the past mining result is explained with reference to the flow chart of drawing 9.

[0063] Up to time 2, the processing is the same as in the 1st embodiment. At time 3, the data of time 0 are removed and the data of time 3 are added; at time 4, the data of time 1 are deleted and the data of time 4 are added. As in the 1st embodiment, the information specifying the content of a rule, its start time, and its accumulation frequency is held as a mining result for each rule that holds over the whole database. In addition, the result 41 of mining the additional data at each time (i.e., the frequency of occurrence, in the additional data, of each rule outputted at the time of the data addition) shall also be held. The procedure at each time is performed as shown in the flow chart of drawing 9.

[0064] First, it is judged whether the rule is included in the accumulation mining result 33 (S61). If this judgment is YES (i.e., if the rule is included in the past mining result), it is judged whether its start time is at or before the time one period earlier (S62). If this judgment is YES, the accumulation frequency is computed as the last accumulation frequency + the frequency at the present time - the frequency at the deleted time (S63). That is, the accumulation mining result for a fixed period is obtained by subtracting the mining result of the period to be deleted from the mining result obtained by verification against the additional data, and then compounding the additional mining result. The start time is set to (the time one period earlier) + 1 (S64).

[0065] If the judgment at Step S61 is YES and the judgment at Step S62 is NO, the accumulation frequency is obtained as the last accumulation frequency + the frequency at the present time (S65), and the start time is left as it is (S66).

[0066] If the judgment at Step S61 is NO, i.e., for a rule that is not included in the past mining result, the rule is outputted if its frequency in the additional data of the present time reaches the minimum support value, with its frequency in the additional data of the present time as the accumulation frequency (S67) and the present time as the start time (S68).
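Steps S61 to S68 can be sketched as follows in Python. This is an illustrative sketch under stated assumptions rather than the implementation of the invention: the per-time frequencies are read off Tables 17 and 18, the minimum support count of 100 per added batch is assumed, and the names ADDED, PERIOD, MIN_COUNT, update and run are hypothetical. The dictionary per_time plays the role of the mining result 41 classified by time.

```python
# Frequency of each rule in the data added at times 0..4, read off the
# increments in Table 18 and the time 0 and time 1 columns of Table 17.
ADDED = {
    1: [200, 160, 180, 150, 140],
    2: [150, 40, 30, 10, 10],
    3: [120, 120, 80, 90, 120],
    4: [100, 60, 110, 70, 100],
    5: [80, 130, 120, 140, 150],
    6: [40, 50, 150, 120, 90],
}
PERIOD = 3       # number of past times whose data are held
MIN_COUNT = 100  # assumed minimum support count within one added batch


def update(result, per_time, added, t, period=PERIOD, min_count=MIN_COUNT):
    """Return the accumulation mining result at time t.

    result   maps rule -> (start_time, accumulation_frequency).
    per_time maps time -> {rule: frequency in the data added at that time};
             it is updated in place (new entry for t, expired entry dropped).
    added    maps rule -> frequency in the data added at time t.
    """
    expired = t - period                  # time whose data leave the period
    dropped = per_time.get(expired, {})
    updated = {}
    for rule, (start, acc) in result.items():         # S61: YES
        freq = added.get(rule, 0)
        if start <= expired:                          # S62: YES
            acc = acc + freq - dropped.get(rule, 0)   # S63
            start = expired + 1                       # S64
        else:                                         # S62: NO
            acc = acc + freq                          # S65 (S66: start kept)
        updated[rule] = (start, acc)
    for rule, freq in added.items():                  # S61: NO
        if rule not in result and freq >= min_count:
            updated[rule] = (t, freq)                 # S67, S68
    # Hold the per-time result for the outputted rules; discard the expired one.
    per_time[t] = {rule: added.get(rule, 0) for rule in updated}
    per_time.pop(expired, None)
    return updated


def run(times=5):
    """Return the list of mining results for times 0..times-1."""
    result, per_time, history = {}, {}, []
    for t in range(times):
        batch = {rule: freqs[t] for rule, freqs in ADDED.items()}
        result = update(result, per_time, batch, t)
        history.append(result)
    return history


if __name__ == "__main__":
    for t, result in enumerate(run()):
        print(f"Time {t}: {sorted(result.items())}")
```

Only the accumulation mining result and the per-time results inside the period are consulted; the past data themselves are never read again, which is the point of the procedure.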

[0067] The mining result at each time when the period is set to 3 according to the above-mentioned procedure is shown in Table 18.

[0068] 

Table 18

          Rule     Start time   Accumulation frequency
Time 0    Rule 1   0            200
          Rule 2   0            150
          Rule 3   0            120
          Rule 4   0            100
Time 1    Rule 1   0            200+160=360
          Rule 2   0            150+40=190
          Rule 3   0            120+120=240
          Rule 4   0            100+60=160
          Rule 5   1            130
Time 2    Rule 1   0            360+180=540
          Rule 2   0            190+30=220
          Rule 3   0            240+80=320
          Rule 4   0            160+110=270
          Rule 5   1            130+120=250
          Rule 6   2            150
Time 3    Rule 1   1            540+150-200=490
          Rule 2   1            220+10-150=80
          Rule 3   1            320+90-120=290
          Rule 4   1            270+70-100=240
          Rule 5   1            250+140=390
          Rule 6   2            150+120=270
Time 4    Rule 1   2            490+140-160=470
          Rule 2   2            80+10-40=50
          Rule 3   2            290+120-120=290
          Rule 4   2            240+100-60=280
          Rule 5   2            390+150-130=410
          Rule 6   2            270+90=360

As is clear from this table, the mining result outputted by this method includes the mining result obtained over the whole database. Moreover, as in the 1st embodiment, it is also easy to delete from the mining result a rule whose frequency has fallen to or below a fixed value.

[0069] As mentioned above, according to this invention, when data are added and deleted, mining of the whole is performed by compounding the mining result obtained by verifying the past mining result against the additional data with the mining result of the additional data, without accessing the past database.
[0070] 

[Effect of the Invention] According to this invention, when data are added to a database, mining of the whole database becomes possible by compounding the mining result of the database before the addition with the mining of the added data, without carrying out mining of the whole database. This is effective for performing mining of large-scale data efficiently.
[0071] Moreover, in a periodic database in which the data of the oldest time are deleted when data are added, mining of the whole database likewise becomes possible using the past mining result, which is effective for performing mining of large-scale data efficiently.



DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] The block diagram of an incremental mining system according to one embodiment of this invention.
[Drawing 2] The flow chart explaining the incremental mining method for obtaining the past mining result of this invention.
[Drawing 3] The flow chart explaining the incremental mining method for obtaining the new mining result according to the 1st embodiment.
[Drawing 4] The flow chart explaining the verification section used by new mining of the 1st embodiment.
[Drawing 5] The flow chart explaining the synthetic section used by new mining of the 1st embodiment.
[Drawing 6] The block diagram of the incremental mining system using an initial mining result.
[Drawing 7] The flow chart explaining the synthetic section in the mining system of drawing 6.
[Drawing 8] The block diagram of an incremental mining system according to the 2nd embodiment of this invention.
[Drawing 9] The flow chart explaining the synthetic section in the mining system of drawing 8.

[Description of Notations] 

11 - Field database
12 - Past mining section
13 - The past mining result
21 - Additional data section
22 - New mining section
23 - Verification section
24 - The synthetic section
31 - Initial database
32 - Initial mining section
